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Resumen 

Por las simulaciones por elementos finitos (EF) para el analisis dinamico de vigas 
ferroviarias sobre fundaciones amortiguadas por friccion requirieren (i) mucho tiempo, 
bien como (ii) conocimiento y software avanzados que ultrapasan los recursos 
disponibles en empresas de diseno en Ingenieria Civil, este articulo ensena el potencial 
de las Redes Neuronales Artificiales (ANN en ingles) para predecir de forma efectiva 
los desplazamientos maximos y la velocidad critica en vigas ferroviarias sobre accion 
de fuerzas moviles. Cuatro modelos basados en ANN son propuestos, uno por cada 
rango de velocidades ([50, 175] U [250, 300] m/s; ] 175, 250[ m/s) y por cada tipo de 
desplazamiento (para arriba o para bajo). Cada modelo es funcion de dos variables 
independientes, el parametro friccional y la velocidad de la fuerza. Entre todos los 
modelos y los 663 datos utilizados, ha sido obtenido un error maximo de 5.4 % cuando 
comparadas las soluciones basadas en ANN y EF. Mientras la ultima tiene asociado 
un tiempo promedio de computacion, por cada punto/dato, de miles de segundos, 
la anterior ni un milisegundo necesita. Este estudio fue un paso importante hacia 
el desarrollo de metodos analiticos basados en ANN que sean mas versatiles (i.e., 
incluyendo mas tipos de variables independientes) para el mismo tipo de problema. 

Palabras clave: Redes Neuronales Artificiales; Vigas Ferroviarias; Fundaciones Amortiguadas 
por Friccion; Fuerzas Moviles; Desplazamientos Maximos; Velocidad Critica 

Abstract 

Since the use of finite element (FE) simulations for the dynamic analysis of railway 
beams on frictionally damped foundations are (i) very time consuming, and (ii) require 
advanced know-how and software that go beyond the available resources of typical 
civil engineering firms, this paper aims to demonstrate the potential of Artificial Neural 
Networks (ANN) to effectively predict the maximum displacements and the critical 
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velocity in railway beams under moving loads. Four ANN-based models are proposed, 
one per load velocity range ([50, 175] U [250, 300] m/s; ] 175, 250[ m/s) and per 
displacement type (upward or downward). Each model is function of two independent 
variables, a frictional parameter and the load velocity. Among all models and the 663 
data points used, a maximum error of 5.4 % was obtained when comparing the ANN- 
and FE-based solutions. Whereas the latter involves an average computing time per data 
point of thousands of seconds, the former does not even need a millisecond. This study 
was an important step towards the development of more versatile (i.e., including other 
types of input variables) ANN-based models for the same type of problem. 

Keywords: Artificial Neural Networks; Railway Beam; Frictionally Damped Foundation; 
Moving Load; Maximum Displacements; Critical Velocity. 


INTRODUCTION 

The study of the dynamic behavior of beams on foundations subjected to moving loads 
with possible applications in high-speed railway track design has been a topic of interest in 
the literature. In particular, the existence of a critical velocity ofthe load for which the beam's 
oscillation amplitudes become very large has been demonstrated [1].The serviceability of 
a high-speed railway track depends on the limitation of these dynamic amplifications [2- 
3], that is, depends on the ability of its substructure to dissipate the energy transmitted 
by the moving loads. This substructure is composed of many stones of several sizes and 
shapes, interacting through surfaces that are almost always in persistent frictional contact. 

Since the main mechanism governing the interaction between the infrastructure's 
constituents is unilateral frictional contact mechanics, a novel non-smooth foundation 
model, which is closer to the true frictional dissipative nature of the ballast than the 
viscous model, was proposed in [4]. The goal of that study was to generalize, for more 
realistic behaviors, the analyses in [5-7] so that they could be of interest for high-speed 
railway engineers. Thus, a finite element (FE) program was developed within a MATLAB 
[8] environment to analyze the dynamic behavior of a Euler-Bernoulli beam on a 
foundation composed of continuous distributions of linear elastic springs and Coulomb 
frictional dissipators/dampers. This "Winkler-Coulomb"-type foundation is represented 
in Fig. 1 under a simply supported beam. It assumes that, in parallel with a linear elastic 
Winkler foundation, there is a reaction per unit length that, at each cross section ofthe 
beam, obeys to Coulomb's friction law: the frictional dissipators apply an instantaneous 
reaction per unit length at cross section x and time instant t depending on the sign of 
the transverse velocity of that cross section, i.e. where (i) f denotes the maximum force 
per unit length that the system of frictional dissipators ofthe foundation may support, 
(ii) is the transverse velocity ofthe cross section, and (iii) if and if .The expression ofthe 
reaction force is an algebraic inclusion [9, 10], meaning that at the instants of vanishing 
velocity the reaction may belong to an interval and at the instants of velocity sign change 
the reaction per unit length is discontinuous. This reaction is very different from the one 
provided by a continuous distribution ofthe traditional linear viscous dampers,, where c 
is the viscous damping coefficient per unit length. In both cases the reaction opposes the 
velocity but, while viscous damping provides a reaction that is proportional to the local 
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velocity itself, the frictional reaction is limited to the interval [- f, + f] and is independent 
1 of the magnitude of the velocity (see Fig. 2 in [4]). In that study a time stepping algorithm 
en v ffi!e specially designed to deal with non-smooth dynamical systems was for the first time 
m9emenas applied to beams on distributed friction foundations and new conclusions on critical 
velocities, maximal displacements and dynamic amplification factors were drawn. 



FIGURE 1. Beam on a frictionally damped continuous foundation under a moving load. 

Since the FE analyses in [4] are (i) very time consuming (thus unfeasible for fast 
engineering estimations), and (ii) require advanced know-how and software that go 
beyond the available resources of typical civil engineering firms, this paper aims to 
demonstrate the potential of Artificial Neural Networks (ANN) to effectively predict 
the maximum displacements in railway tracks on frictional foundations, as function 
of the frictional parameter (fu) and load velocity (v). This is an important step towards 
the future development of much more versatile ANN-based analytical models for the 
same type of problem. The difference will be the inclusion of more independent (input) 
variables, such as the foundation stiffness modulus A, the applied load magnitude/ 7 , and 
geometrical/mechanical properties of the railway beam (see Fig. 1). 


FE-BASED MODEL AND DATA GATHERING 

The authors considered a horizontal simply supported linear elastic Euler-Bernoulli 
beam (see Fig. 1) of 200 m length, cross-sectional area and central moment of inertia 
respectively equal to A and /, mass density p and Young's modulus E. The properties of 
the beam are summarized inTable 1 and correspond to the ones of a UIC60 rail. Previous 
studies [5-7] showed that a 200 m simply supported beam is a good finite length model 
to approximate the behavior of an infinite beam on elastic foundation with a single 
moving load. The beam is supposed to be connected to a fixed foundation bed by a 
system of linear elastic springs, with stiffness per unit length denoted by k, associated 
in parallel with a continuous distribution of friction dampers with a maximum force per 
unit length f u . A downward concentrated force F = 83.4 kN, corresponding to half of the 
load per axle of a Thalys high speed train locomotive, acts on the beam moving from 
left to right at a constant velocity v (numerical results considered v ranging between 
50 m/s and 300 m/s at intervals of 5 m/s). The motion of the beam is governed by a 
partial differential inclusion (eq. (2) in [4]) that is (i) semi-discretized in space, using 
the finite element method, as a system of ordinary differential inclusions (eq. (5) in 
[4]), and (ii) integrated in time using a special implementation of the Non-smooth 
Contact Dynamics Method (NSCD) developed by Moreau [11] and Jean [12], adapted 
to distributed Coulomb friction. The stiffness modulus of the foundation is k= 250 kN/ 
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m2 and thirteen different values of the maximum force per unit length of the frictional 
dampers f u (0,1,2, 3, 3.5,4, 5, 6, 7, 7.5, 8, 9 and 10 kN/m) are considered. All simulations 
employed a 200-element uniform mesh and a time step h such that vh = 0.1 m (i.e., in 
each time step the load progresses 10 cm independently of its velocity v). The suitability 
of the mesh refinement and time step was assessed in [4]. The self-weight of the beam 
is not considered, as in [4], in order to allow the comparison of results. 

The maximum upward (positive w ma ) and downward (negative w ma ) transverse 
displacements of the beam under moving load are obtained as function of the frictional 
parameter f u and load velocity v, using the FE program mentioned before. For each pair 
of f u and v values, the computational time to obtain the corresponding upward and 
downward maximum displacements ranged between 4000 and 4800 seconds, when 
using a computer with an Intel® Core™ i5-3470 CPU @ 3.20 GFIz, 8 GB of RAM, and a 64- 
bit Operating System. 

The aforementioned data was then used to feed the neural networks analyzed in this 
work. After some experiments, and aiming to obtain sufficiently accurate (maximum 
error smaller or around 5%) models, it was decided to develop four independent ANN- 
based models, namely for: (i) negative w max {v= [50, 175] U [250, 300] m/s), (ii) negative 
w mJv = ] 175, 250[ m/s), (iii) positive w max {v = [50,175] U [250,300] m/s), and (iv) positive 
w max ( v = 75, 250[ m/s). The choice for these data ranges is easily understandable if one 
observes the last figure in this manuscript. Each model is described in sub-sections 4.1-4.4, 
respectively, and characterized by two independent (f u , v) and one dependent (positive or 
negative w m J variables. The datasets used in the development and final testing of each 
model are available in [13]. 


TABLE 1. Properties of the UIC60 rail [4]. 


1 Property 

Value I 

Young's modulus (£) 

210 GPa 

Central area moment of inertia (/) 

3055 x 10" 8 m 4 

Cross-sectional area ( A ) 

7684 x 10 -6 m 2 

Density ( p ) 

7800 kg/m 3 


ARTIFICIAL NEURAL NETWORKS 
Brief Introduction 

Since artificial neural networks (ANN) are a machine learning technique widely described 
in the scientific literature, please refer to [14] for a more thorough presentation of this 
topic. ANN's computing power makes them suitable to efficiently solve small to large- 
scale complex problems, which can be attributed to their (i) massively parallel distributed 
structure and (ii) ability to learn and generalize, i.e, produce reasonably accurate outputs 
for inputs not used during the learning phase. The general ANN structure consists of 
several nodes disposed in L vertical layers (input layer, hidden layers, and output layer) 
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and connected between them, as depicted in Fig. 2. Associated to each node in layers 2 
. lp| to L, also called neuron, is a linear or nonlinear transfer function, which receives an input 
en v ffi!e and transmits an output. In this work, only feedforward networks were implemented, i.e. 
m9emenas output from any node is only transmitted to nodes located in subsequent layers. 


Input 



FIGURE 2. Example of a feedforward network with node structure 3-2-1 


Learning 

Learning is nothing else than determining network unknown parameters through some 
algorithm in order to minimize network's performance measure, typically a function of 
the difference between predicted and target (desired) outputs. When ANN learning is 
iterative in nature, it consists of three phases: (i) training, (ii) validation, and (iii) testing. 
From previous knowledge, examples or data points are selected to train the network, 
grouped in the so-called training dataset. During an iterative learning, while the training 
dataset is used to tune network unknowns, a process of cross-validation takes place 
by using a set of data completely distinct from the training counterpart (the validation 
dataset), so that the generalization performance of the network can be attested. Once 
'optimum'network parameters are determined, typically associated to a minimum of the 
validation performance curve (called early stop - see Fig. 3), many authors still perform 
a final assessment of model's accuracy, by presenting to it a third fully distinct dataset 
called 'testing'. Heuristics suggests that early stopping avoids overfitting, i.e. the loss of 
ANN's generalization ability. 



FIGURE 3. Assessing ANN's generalization ability via cross-validation. 
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Implemented ANN features 

The'behavior'of any ANN depends on many'features', having been implemented 15 ANN 
features in this work (including data pre/post processing ones). For those features, it is 
importantto bear in mind that no ANN guarantees good approximations via extrapolation, 
i.e. the implemented ANNs should not be applied outside the input variable ranges used 
for network training. Since there are no objective rules dictating which method per 
feature guarantees the best network performance for a specific problem, an extensive 
parametric analysis (composed of nine parametric sub-analyses) was carried out to find 
'the optimum' net design. A description of all methods/formulations implemented for 
each ANN feature (see Tabs. 2-4) - they are a selection from state of art literature on ANNs, 
including both traditional and promising modern techniques, can be found in previous 
published works (e.g., [14,15]) - the reader might need to go through it to fully understand 
the meaning of all variables reported in this manuscript. The whole work was coded in 
MATLAB [8], making use of its neural network toolbox when dealing with popular learning 
algorithms (1 -3 in Tab. 4). Each parametric sub-analysis (SA) consists of running all feasible 
combinations (also called 'combos') of pre-selected methods for each ANN feature, in 
order to get performance results for each designed net, thus allowing the selection of the 
best ANN according to a certain criterion. The best network in each parametric SA is the 
one exhibiting the smallest average relative error (called performance) for all learning data. 

TABLE 2. Implemented ANN features (F) 1-5. 


FEATURE 

METHOD 

1 

FI 

F2 

F3 

F4 

F5 

Qualitative 
Var Represent 

Dimensional 

Analysis 

Input 

Dimensionality 

Reduction 

% 

Train-Valid- 
Test 

Input 

Normalization 

Boolean 

Vectors 

Yes 

Linear Correlation 

80-10-10 

Linear Max Abs 

2 

Eq Spaced in 
]0,1] 

No 

Auto-Encoder 

70-15-15 

Linear [0,1] 

3 

- 

- 

- 

60-20-20 

Linear [-1,1] 

4 

- 

- 

Ortho Rand Proj 

50-25-25 

Nonlinear 

5 

- 

- 

Sparse Rand Proj 

- 

Lin Mean Std 

6 

- 

- 

No 

- 

No 


TABLE 3. Implemented ANN features (F) 6-10. 


FEATURE 

METHOD 

F6 

F7 

F8 

F9 

F10 

Output 

Transfer 

Output Normalization 

Net 

Architecture 

Hidden 

Layers 

Connectivity 

1 

Logistic 

Lin [a, b] = 0.7[ip min , (p m J 

MLPN 

1 HL 

Adjacent Layers 

2 

- 

Lin [a, b] = 0.6[cp min , cp m J 

RBLN 

2 HL 

Adj Layers + In-Out 

3 

Hyperbolic 

Tang 

Lin [a, b] = 0.5[cp min , cp m J 

- 

3 HL 

Lully-Connected 

4 

- 

Linear Mean Std 

- 

- 

- 

5 

Bilinear 

No 

- 

- 

- 

6 

Com pet 

- 

- 

- 

- 

7 

Identity 

- 

- 

- 

- 
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TABLE 4. Implemented ANN features (F) 11-15. 


FEATURE 

METHOD 

F11 

F12 

F13 

F14 

F15 

Hidden Transfer 

Parameter Initialization 

Learning 

Algorithm 

Performance 

Improvement 

Training 

Mode 

1 

Logistic 

Midpoint (W) + Rands (b) 

BP 

NNC 

Batch 

2 

Identity-Logistic 

Rands 

BPA 

- 

Mini-Batch 

3 

HyperbolicTang 

Randnc (W) + Rands (b) 

LM 

- 

Online 

4 

Bipolar 

Randnr (W) + Rands (b) 

ELM 

- 

- 

5 

Bilinear 

Randsmall 

mb ELM 

- 

- 

6 

Positive Sat Linear 

Rand [-441 

l-ELM 

- 

- 

7 

Sinusoid 

SVD 

CI-ELM 

- 

- 

8 

Thin-Plate Spline 

MBSVD 

- 

- 

- 

9 

Gaussian 

- 

- 

- 

- 

10 

Multiquadratic 

- 

- 

- 

- 

11 

Radbas 

- 

- 

- 

- 


Network Performance Assessment 


Several types of results were computed to assess network outputs, namely (i) maximum 
error, (ii) % errors greater than 3%, and (iii) performance, which are defined next. All 
abovementioned errors are relative errors (expressed in %) based on the following 
definition, concerning a single output variable and data pattern, 


^=ioo 



(D 


qp 

where (i) d is the g f/1 desired (or target) output when pattern p within iteration / (p=1,..., 
P) is presented to the network, and (ii) y is net's q th output for the same data pattern. 
Moreover, denominator in eq. (1) is replaced by 1 whenever \d qp \ < 0.05 - d qp in the 
nominator keeps its real value. This exception to eq. (1) aims to reduce the apparent 
negative effect of large relative errors associated to target values close to zero. 


Maximum Error 

This variable measures the maximum relative error, as defined by eq. (1), among all 
output variables and learning patterns. 


Percentage of Errors > 3% 

This variable measures the percentage of relative errors, as defined by eq. (1), among all 
output variables and learning patterns, that are greater than 3%. 


Performance 

In functional approximation problems, network performance is defined as the average 
relative error, as defined in eq. (1), among all output variables and data patterns being 
evaluated (e.g., training, all data). 
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Software Validation 

Several benchmark datasets/functions were used to validate the developed software, 
involving low- to high-dimensional problems and small to large volumes of data. Due 
to paper length limit, validation results are not presented herein but they were made 
public online [16]. 


RESULTS AND PROPOSED ANN-BASED MODELS 

Aiming to reduce the computing time by cutting in the number of combos to be run 
- note that all features combined lead to hundreds of millions of combos, the whole 
parametric simulation was divided into nine parametric sub-analyses (SAs), where in 
each one feature 7 only takes a single value.This measure aims to make the performance 
ranking of all combos within each SA analysis more 'reliable', since results used for 
comparison are based on target and output datasets as used in ANN training and 
yielded by the designed network, respectively (they are free of any postprocessing that 
eliminates output normalization effects on relative error values). Whereas (i) the 1 st and 
2 nd SAs aimed to select the best methods from features 1,2,5, 8 and 13 (all combined), 
while adopting a single popular method for each of the remaining features (F : 6, F 4 : 
2, F 6 :{1 or 7}, Fy 1, F g : 1, F 10 : 1, F n : {3, 9 or 11}, F 12 : 2, F 14 : 1, F 15 : 1 - see Tables 2-4) - SA 1 
involved learning algorithms 1-3 and SA 2 involved the ELM-based counterpart, (ii) the 
3 rd - 7 th SAs combined all possible methods from features 3,4,6 and 7, and concerning 
all other features, adopted the methods integrating the best combination from the 
aforementioned first SA, (iii) the 8 th SA combined all possible methods from features 
11,12 and 14, and concerning all other features, adopted the methods integrating the 
best combination (results compared after postprocessing) among the previous five sub¬ 
analyses, and lastly (iv) the 9 th SA combined all possible methods from features 9, 10 
and 15, and concerning all other features, adopted the methods integrating the best 
combination from the previous analysis. The microprocessors used for ANN simulations 
have the following features: OS: WinlOFIome 64bits, RAMs: 48.0 / 128 GB, Local Disk 
Memory: 1 TB, CPUs: Intel® Core™ i7 8700K @ 3.7-4.7 GHz / i9 7960X @ 2.8-4.2 GHz. The 
datasets used in the development and final testing of all ANN models are available 
in [13]. In what follows, the parametric analysis results and the proposed ANN-based 
models are presented for each of the four problems addressed, namely: (i) negative w max 
(v = [50,175] U [250,300] m/s), (ii) negative w max (v = ] 1 75,250[ m/s), (iii) positive w max (v 
= [50,175] U [250,300] m/s), and (iv) positive w max (v = ]175,250[ m/s). It is important to 
note that, in this manuscript, whenever a vector is added to a matrix, it means the former 
is added to all columns of the latter (valid in MATLAB). 

Negative w max (v = [50,175] U [250,300] m/s) 

ANN feature methods used in the best combo from each of the abovementioned nine 
parametric SAs are specified in Table 5 (see Tables 2-4). Table 6 shows the corresponding 
relevant results for those combos and the 481 -point final testing dataset (which includes 
the ANN learning/development dataset), namely (i) maximum error, (ii) percentage of 
errors larger than 3%, (iii) performance (all described in sub-section 3.4, and evaluated 
for all learning data), (iv) total number of hidden nodes in the model, and (v) average 
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computing time per example (including data pre- and post-processing). All results shown 
in Table 6 are based on target and output datasets computed in their original format, i.e. 
free of any transformations due to output normalization and/or dimensional analysis. 
Summing up the ANN feature combinations for all parametric SAs, a total of 219 combos 
were run for this problem. 


TABLE 5. ANN feature (F) methods used in the best combo from each parametric sub-analysis (SA). 


SA 1 

F2 

F3 

F4 

F5 

F6 

F7 

F8 

F9 

F10 

F11 

F12 

F13 

F14 

F15 

1 1 

2 

6 

2 

5 

1 

1 

1 

1 

1 

3 

2 

3 

1 

3 

2 1 

2 

6 

2 

5 

7 

1 

1 

1 

1 

3 

2 

5 

1 

3 

3 1 

2 

6 

2 

5 

1 

1 

1 

1 

1 

3 

2 

3 

1 

3 

4 1 

2 

6 

2 

5 

1 

2 

1 

1 

1 

3 

2 

3 

1 

3 

5 1 

2 

6 

4 

5 

1 

3 

1 

1 

1 

3 

2 

3 

1 

3 

6 1 

2 

6 

2 

5 

7 

4 

1 

1 

1 

3 

2 

3 

1 

3 

7 1 

2 

6 

4 

5 

3 

5 

1 

1 

1 

3 

2 

3 

1 

3 

8 1 

2 

6 

4 

5 

3 

5 

1 

1 

1 

3 

5 

3 

1 

3 

9 1 

2 

6 

4 

5 

3 

5 

1 

3 

3 

3 

5 

3 

1 

3 


The proposed model is the one, among the best ones from all parametric SAs, exhibiting 
the lowest maximum error (SA 7 - a Neural Network Composite (NNC)). Aiming to allow 
implementation of this model by any user, all variables/equations required for (i) data 
preprocessing, (ii) ANN simulation, and (iii) data postprocessing, are presented in the 
following sub-sections.The proposed model is an NNC made of 7 ANNs with architecture 
MLPN and a distribution of nodes/layer equal to 2-8-1 for every network. Concerning 
connectivity, all networks are partially-connected, and the hidden and output transfer 
functions are all Hyperbolic Tangent. All networks were trained using the LM algorithm. 
After design, the average NNC computing time concerning the presentation of a single 
example (including data pre/postprocessing) is 3.12E-05 s; Fig. 4 depicts a simplified 
scheme of each NNC network. Finally, all relevant performance results concerning the 
proposed NNC are illustrated in sub-section 4.1.4. 

TABLE 6. Performance results for the best design from each parametric SA and for the final testing dataset 
(includes the ANN learning/development dataset): ANN and NNC. 
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ANN 

SA 

Max Error 

(%) 

Performance 
All Data 

(%) 

Errors > 3% 

(%) 

Total Hidden 
Nodes 

Running Time/ 
Data Point 
(s) 

1 

5.6 

1.0 

3.7 

8 

7.95 E-05 

2 

15.3 

2.3 

20.4 

40 

2.89E-05 

3 

5.7 

1.0 

3.7 

8 

2.88E-05 

4 

6.3 

0.9 

3.5 

8 

2.91 E-05 

5 

5.7 

1.0 

2.7 

8 

3.64E-05 

6 

5.3 

1.0 

4.8 

8 

4.86E-05 
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7 

6.1 

1.0 

4.6 

8 

2.75 E-05 

8 

7.5 

1.0 

4.4 

8 

4.66E-05 

9 

3.2 

0.6 

0.2 

9 

4.36E-05 


SA 

1 

2 

3 

4 

5 

6 

7 

8 


NNC 


Max Error 

(%) 


Performance 
All Data 

(%) 


Errors > 3% Total Hidden 
(%) Nodes 


Running Time/ 
Data Point 
(s) 


2.7 

0.4 

0.0 

8 

3.12E-05 

5.8 

0.8 

2.3 

8 

4.89E-05 


N (outputs). 

-<l)—*( i) — 

i th MLPN (i = 1,...,7) 

(NNC computing time = 3.12E-05 s/example) 

FIGURE 4. Proposed NNC made of 7 partially-connected MLPNs - simplified scheme. 

Input Data Preprocessing 

For future use of the proposed NNC to simulate new data Y Jsim (a 2 x P. m matrix) 
concerning P. m patterns, the same data preprocessing (if any) performed before training 
must be applied to the input dataset.That preprocessing is defined by the methods used 
for ANN features 2, 3 and 5 (respectively 2, 6 and 5 - see Table 2). Next, the necessary 
preprocessing to be applied to Y y sjm , concerning features 2,3 and 5, is fully described. 

Dimensional Analysis and Dimensionality Reduction 

Since no dimensional analysis (d.a.) nor dimensionality reduction (of.r.) were carried out, 
one has 

{wr=k 




Input Normalization 

After input normalization, the new input dataset [Y ] si J a n fter is defined as function of the 
previously determined [Y ] sj J a d f r ter , and they have the same size, reading 
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(R4r -JNPC, 1 )) •/ INP(:,2) 


INP = 


5000 3166.27801301405 

161.527777777778 82.2184600671207 


(3) 


where one recalls that (i) INP(:, j) represents column j of matrix INP, and (ii) operator 7' 
divides row i in the numerator by INP(i, 2). 

ANN-Based Analytical Model 

Once determined the preprocessed input dataset [Y } sjm } a n fter (a 2 x P im matrix), the 
next step is to present it to the proposed NNC to obtain the predicted output dataset 
{Y 3sjm } a n fter (a 1 x P. m vector), which will be given in the same preprocessed format of the 
target dataset used in learning. In order to convert the predicted outputs to their'original 
format' (i.e., without any transformation due to normalization or dimensional analysis), 
some postprocessing might be needed, as discussed in 4.1.3. Next, the mathematical 
representation of the proposed NNC is given, so that any user can implement it to 
determine { Y 3sjm } a n fter , thus contributing to diminish the generalized opinion that ANNs are 
'black boxes': 

(y \ after = {v \ after + YiY Y fter 

r 3 ,sim] n r 3 ,simj n{0) Z-J j„ (;) (4) 

being 

r m =v 2 (irl m {K,<.}r +b m) 

,+No) (5) 

where i = 0 ,..., 6 and 

e s —s 

— 0 

<P 2 =<P3=<P( S )= s , ( 6 ) 

e +e 

and arrays W (i) and b s(i) are stored online in [17], aiming to avoid an overlong article and 
ease model's implementation by any interested reader. 

Output Data Postprocessing 

In order to transform the output dataset obtained by the proposed NNC, { Y 3sjm }° fter (a 1 
x P sjm vector), to its original format {Y 3 jm ), i.e. without the effects of dimensional analysis 
and/or output normalization (possibly) taken in target dataset preprocessing prior 
training, one has 



adopted in the proposed model. 
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FIGURE 5. Regression plot for the proposed NNC. 


Performance Results 

Finally, the results yielded by the proposed NNC for the 481-point final testing dataset 
(which includes the ANN learning/development counterpart), in terms of performance 
variables defined in sub-section 3.4, are presented in this sub-section in the form of two 
graphs: (i) a regression plot (Fig. 5), where network target and output data are plotted, 
for each data point, as x- and y- coordinates, respectively - a measure of quality is given 
by the Pearson Correlation Coefficient (/?); and (ii) a plot (Fig. 6) indicating (for all data) 
the (iil) maximum error, (ii2) percentage of errors larger than 3%, and (ii3) average error 
(called performance). 


2.7% 



max error NNC % errors >3% NNC performance NNC 


FIGURE 6. Maximum and average (performance) errors for the proposed NNC. 
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Negative w max ( v= ]175,250[ m/s) 

ANN feature methods used in the best combo from each of the abovementioned nine 
parametric SAs are specified in Table 7 (numbers represent the method number as in 
Tables 2-4). Table 8 shows the corresponding relevant results for those combos and the 
208-point final testing dataset (which includes the ANN learning/development dataset), 
namely (i) maximum error, (ii) percentage of errors larger than 3%, (iii) performance (all 
described in sub-section 3.4, and evaluated for all learning data), (iv) total number of 
hidden nodes in the model, and (v) average computing time per example (including 
data pre- and post-processing). All results shown in Table 8 are based on target and 
output datasets computed in their original format, i.e. free of any transformations due 
to output normalization and/or dimensional analysis. Summing up the ANN feature 
combinations for all parametric SAs, a total of 204 combos were run for this problem. 


The proposed model is the one, among the best ones from all parametric SAs, exhibiting 
the lowest maximum error (SA 9 - a Neural Network Composite (NNC)). Aiming to allow 
implementation of this model by any user, all variables/equations required for (i) data 
preprocessing, (ii) ANN simulation, and (iii) data postprocessing, are presented in the 
following sub-sections.The proposed model is an NNC made of 3 ANNs with architecture 
RBFN and a distribution of nodes/layer given by 2-3-3-3-1 for every network. Concerning 
connectivity, all networks are partially-connected (see Fig. 7), and the hidden and output 
transfer functions are all Gaussian RBF and FlyperbolicTangent, respectively. All networks 
were trained using the LM algorithm. After design, the average NNC computing time 
concerning the presentation of a single example (including data pre/postprocessing) is 
7.87E-05 s. Fig. 7 depicts a simplified scheme of each NNC network. Finally, all relevant 
performance results concerning the proposed NNC are illustrated in sub-section 4.2.4. 

TABLE 7. ANN feature (F) methods used in the best combo from each parametric sub-analysis (SA). 


SA 1 

F2 

F3 

F4 

F5 

F6 

F7 

F8 

F9 

F10 

F11 

F12 

F13 

F14 

F15 

1 1 

2 

6 

2 

5 

1 

1 

2 

1 

1 

11 

2 

3 

1 

3 

2 1 

2 

6 

2 

3 

7 

1 

2 

1 

1 

9 

2 

5 

1 

3 

3 1 

2 

6 

1 

5 

1 

1 

2 

1 

1 

11 

2 

3 

1 

3 

4 1 

2 

6 

1 

5 

1 

2 

2 

1 

1 

11 

2 

3 

1 

3 

5 1 

2 

6 

3 

5 

1 

3 

2 

1 

1 

11 

2 

3 

1 

3 

6 1 

2 

6 

1 

5 

7 

4 

2 

1 

1 

11 

2 

3 

1 

3 

7 1 

2 

6 

1 

5 

3 

5 

2 

1 

1 

11 

2 

3 

1 

3 

8 1 

2 

6 

1 

5 

3 

5 

2 

1 

1 

11 

2 

3 

1 

3 

9 1 

2 

6 

1 

5 

3 

5 

2 

3 

2 

11 

2 

3 

1 

3 


input 





ith RBFN (i = 1,..„ 3) 

(NNC computing time = 7.87E-05 s/example) 


FIGURE 7. Proposed NNC made of 3 partially-connected RBFNs - simplified scheme. 
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Input Data Preprocessing 

For future use of the proposed NNC to simulate new data Y ]sjm (a 2 x P matrix) 
concerning P patterns, the same data preprocessing (if any) performed before 
training must be applied to the input dataset. That is defined by the methods used 
for ANN features 2, 3 and 5 (respectively 2, 6 and 5 - see Table 2). Next, the necessary 
preprocessing to be applied to Y ] sjm is fully described. 

TABLE 8. Performance results for the best design from each parametric SA and for the final testing dataset 
(includes the ANN learning/development dataset): ANN and NNC. 



ANN 

SA 

Max Error 

(%) 

Performance 
All Data 

(%) 

Errors > 3% 

(%) 

Total Hidden 
Nodes 

Running Time/ 
Data Point 
(s) 

1 

10.7 

2.8 

38.9 

8 

1.68E-04 

2 

102.8 

16.7 

69.7 

50 

5.50E-05 

3 

12.1 

3.1 

43.3 

8 

5.26E-05 

4 

12.0 

3.1 

44.7 

8 

4.11E-05 

5 

20.7 

3.0 

38.5 

8 

8.19E-05 

6 

22.8 

3.0 

40.9 

8 

3.84E-05 

7 

18.7 

2.7 

32.7 

8 

4.47 E-05 

8 

14.2 

3.0 

39.4 

8 

5.63 E-05 

9 

11.4 

1.8 

13.9 

9 

5.10E-05 




NNC 



SA 

Max Error 

(%) 

Performance 
All Data 

(%) 

Errors > 3% 

(%) 

Total Hidden 
Nodes 

Running Time/ 
Data Point 
(s) 

1 

2 

o 

- 

- 

- 

- 

- 

D 

4 

- 

- 

- 

- 

- 

5 

12.0 

2.5 

29.8 

8 

1.16E-04 

6 

8.0 

1.0 

6.3 

8 

6.63 E-05 

7 

8.4 

1.8 

20.2 

8 

5.5 2 E-05 

8 

- 

- 

- 

- 

- 

9 

5.2 

1.1 

4.8 

9 

7.87E-05 


Dimensional Analysis and Dimensionality Reduction 

Since no dimensional analysis ( d.a .) nor dimensionality reduction (d.r.) were carried out, 
one has 
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ANN-Based Analytical Model 

Once determined the preprocessed input dataset {Y ] sim }f er (a 2 x P. m matrix), the next 
step is to present it to the proposed NNC to obtain the predicted output dataset {/ 5 sim }f er 
(a 1 xP sim vector), which will be given in the same preprocessed format of the target dataset 
used in learning.To convert the predicted outputs totheirbriginal format'(i.e., without any 
transformation due to normalization or dimensional analysis), some postprocessing might 
be needed, as described in 4.2.3. Next, the mathematical representation of the proposed 
NNC is given, so that any user can implement it to determine { Y 5sjm }° fter ■ 



yfter yfter _ 

'l l ’ sim J d.r. I l ’ sim )d.a. l ’ sim 


( 8 ) 


Input Normalization 


After input normalization, the new input dataset is defined as function of the previously 
determined , and they have the same size, reading 

(Ur = (MT ■ INP ( ; ’ 1 )) •/ INP(:,2) 


INP = 


5000 3171.29986868837 
212.5 23.1146212230639 


(9) 


where one recalls that operator 7'divides row / in the numerator by INP(/, 2). 


where (l = 0,1,2) 

*2(0 = ^ 2 (^( 0 ) 

- 2 d)( l nP) = ||{^m}7 2(/)( : > / l)|| 


S d) = OWC 2 (/) X *2(0 


^3(0 - *Pl Co ) 

aUX i(i) (A > p) = 1^2(0 ("> p) — ^2-3(0 (•> A ) 


S (l) = aux m .X b m 


^4(0 “04 Co) 


aUX 4(j) (A » /*) k(0 (•> /0 ^3-4(0 (•» A 


•5(0 = ««*40) X *4(0 


( 11 ) 
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and (i) p=1,..., P sim , / =1,2,3, (ii) operator'x'multiplies every element in row s of the first 
array by element s of second array (a vector), yielding an array of the same size of the 
first, and (iii) , s _„i 

<P2=<P?>=(Pa=<P\ s ) = q - 


<Ps = <P5( S ) = 


e -e 


e +e 


( 12 ) 


Arrays W js{j) and b s{j) are stored online in [18]. 


Output Data Postprocessing 


In order to transform the output dataset obtained by the proposed NNC, { Y 5sim } n after (a 1 
x P sjm vector), to its original format (/ ), i.e. without the effects of dimensional analysis 

and/or output normalization (possibly) taken in target dataset preprocessing prior 
training, one has 



since no output normalization nor dimensional analysis were adopted in the proposed 
model. 


:R = 0.99971 



Target 

FIGURE 8. Regression plot for the proposed NNC. 


Performance Results 

Finally, the results yielded by the proposed NNC for the 208-point final testing dataset 
(which includes the ANN learning/development counterpart), in terms of performance 
variables defined in sub-section 3.4, are presented in this sub-section in the form of two 
graphs: (i) a regression plot (Fig. 8), where networktargetand output data are plotted, for 
each data point, asX- and y- coordinates, respectively; and (ii) a plot (Fig. 9) indicating 
(for all data) the (iii) maximum error, (ii2) percentage of errors larger than 3%, and (ii3) 
average error (called performance). 
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max error NNC % errors > 3% NNC performance NNC 


FIGURE 9. Maximum and average (performance) errors for the proposed NNC. 

Positive w mM (v= [50,175] U [250,300] m/s) 


ANN feature methods used in the best combo from each of the abovementioned nine 
parametric SAs are specified in Table 9 (numbers represent the method number as in 
Tables 2-4). Table 10 shows the corresponding relevant results for those combos and the 
481-point final testing dataset (which includes the ANN learning/development dataset), 
namely (i) maximum error, (ii) percentage of errors larger than 3%, (iii) performance (all 
described in sub-section 3.4, and evaluated for all learning data), (iv) total number of 
hidden nodes in the model, and (v) average computing time per example (including data 
pre- and post-processing). All results shown in Table 10 are based on target and output 
datasets computed in their original format, i.e. free of any transformations due to output 
normalization and/or dimensional analysis. Summing up the ANN feature combinations 
for all parametric SAs, a total of 219 combos were run for this problem. 


TABLE 9. ANN feature (F) methods used in the best combo from each parametric sub-analysis (SA). 


FI 

F2 

F3 

F4 

F5 

F6 

F7 

F8 

F9 

F10 

F11 

F12 

F13 

F14 

F15 

1 

2 

6 

2 

3 

1 

1 

1 

1 

1 

3 

2 

3 

1 

3 

1 

2 

6 

2 

2 

7 

1 

2 

1 

1 

9 

2 

5 

1 

3 

1 

2 

6 

3 

3 

1 

1 

1 

1 

1 

3 

2 

3 

1 

3 

1 

2 

6 

3 

3 

1 

2 

1 

1 

1 

3 

2 

3 

1 

3 

1 

2 

6 

1 

3 

1 

3 

1 

1 

1 

3 

2 

3 

1 

3 

1 

2 

6 

4 

3 

7 

4 

1 

1 

1 

3 

2 

3 

1 

3 

1 

2 

6 

4 

3 

7 

5 

1 

1 

1 

3 

2 

3 

1 

3 

1 

2 

6 

4 

3 

7 

5 

1 

1 

1 

1 

2 

3 

1 

3 

1 

2 

6 

4 

3 

7 

5 

1 

3 

3 

1 

2 

3 

1 

3 


170 























Abambres / Correa / Pinto da Costa / Simoes (2019) 


The proposed model is the one, among the best ones from all parametric SAs, exhibiting 
the lowest maximum error (SA 9). Aiming to allow implementation of this model by any 
user, all variables/equations required for (i) data preprocessing, (ii) ANN simulation, and (iii) 
data postprocessing, are presented in the following sub-sections.The proposed model is a 
single MLPN with 5 layers and a distribution of nodes/layer given by 2-3-3-3-1. Concerning 
connectivity, the network is fully-connected, and the hidden and output transfer 
functions are all Logistic and Identity, respectively. The network was trained using the LM 
algorithm (1500 epochs). After design, the average network computing time concerning 
the presentation of a single example (including data pre/postprocessing) is 2.49E-05 s; 
Fig. 10 depicts a simplified scheme of some of network key features. Finally, all relevant 
performance results concerning the proposed ANN are illustrated in sub-section 4.3.4. 



FIGURE 10. Proposed 2-3-3-3-1 fully-connected MLPN- simplified scheme. 


Table 10. Performance results for the best design from each parametric SA and for the final testing dataset 
(includes the ANN learning/development dataset): ANN and NNC. 





ANN 



SA 

Max Error 

Performance 

Errors > 3% 

Total Hidden 

Running Time / Data 

(%) 

All Data (%) 

(%) 

Nodes 

Point (s) 

1 

7.4 

0.5 

7.1 

8 

2.58E-05 

2 

8.2 

0.7 

8.5 

60 

3.43E-05 

3 

6.7 

0.6 

7.7 

8 

4.52E-05 

4 

6.9 

0.5 

7.5 

8 

2.70E-05 

5 

7.0 

0.5 

7.3 

8 

2.62E-05 

6 

6.2 

0.5 

8.3 

8 

2.65 E-05 

7 

6.7 

0.5 

6.7 

8 

3.72E-05 

8 

6.4 

0.5 

7.5 

8 

2.91 E-05 

9 

3.7 

0.2 

0.8 

NNC 

9 

2.49E-05 

SA 

Max Error 

Performance 

Errors > 3% 

Total Hidden 

Running Time / Data 

(%) 

All Data (%) 

(%) 

Nodes 

Point (s) 

1 

2 

- 

- 

- 

- 

- 

3 

- 

- 

- 

- 

- 

4 

6.9 

0.4 

5.8 

8 

2.87E-05 

5 

- 

- 

- 

- 

- 

6 

3.7 

0.2 

0.2 

8 

3.29E-05 

7 

6.4 

0.5 

6.7 

8 

4.16E-05 

8 

6.2 

0.5 

7.1 

8 

2.98E-05 

9 

- 

- 

- 

- 

- 
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Input Data Preprocessing 

For future use of the proposed ANN to simulate new data Y ]sjm (a 2 x P. m matrix) 
concerning P. m patterns,the same data preprocessing (if any) performed before training 
must be applied to the input dataset. That preprocessing is defined by the methods 
used for ANN features 2, 3 and 5 (respectively 2, 6 and 3 - see Table 2). In what follows, 
the necessary preprocessing to be applied to Y y sjm is fully described. 


Dimensional Analysis and Dimensionality Reduction 

Since no dimensional analysis {d.a.) nor dimensionality reduction (cf.r.) were carried out, 
one has 



(14) 


Input Normalization 

After input normalization, the new input dataset is defined as function of the previously 
determined , and they have the same size, reading 



= INP(:4) + raZ?.x|{^. m }^ r 


INP = 


-1 1 0 
-1 1 50 


10000 

300 



mb = INP(:,2) - INP(:,1) 
den = INP(:,4) - INP(:,3) 


(15) 


where one recalls that operator'.x'multiplies component /in vector rab by all components 
in row / of subsequent term (analogous definition holds for 7'). 


ANN-Based Analytical Model 


Once determined the preprocessed input dataset {Y ]sim }° fter (a 2 x P. m matrix), the 
next step is to present it to the proposed ANN to obtain the predicted output dataset 
{Y 5si X er ( a 1 x ^ sim vector), which will be given in the same preprocessed format of the 
target dataset used in learning. In order to convert the predicted outputs to their'original 
format' (i.e., without any transformation due to normalization or dimensional analysis), 
some postprocessing might be needed, as described in 4.3.3. Next, the mathematical 
representation of the proposed ANN is given, so that any user can implement it to 
determine!/, . } after : 

(16) 

Y 3 =n(K 3 {Y^yr +w ^ Y 2 +b i) 
y*=<p< (k< {lur++*4) 

{X,4r = <Ps(Ks {Y h ,X er+ KsY 2 +K s Y 3 +W 4 T _ 5 Y 4 +b 5 ) 
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where 



( 17 ) 


Arrays W and b s can be found online in [19]. 

Output Data Postprocessing 

In order to transform the output dataset obtained by the proposed ANN, {Y Ssi Jf er 
(a 1 x P sjm vector), to its original format (Y 5sj J f i.e. without the effects of dimensional 
analysis and/or output normalization (possibly) taken in target dataset preprocessing 
prior training, one has 


Y = iY ] after 

5,57m ( 5 ,sim J n 


(18) 


since no output normalization nor dimensional analysis were adopted in the proposed model. 


: R = 0.99984 



0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 

Target 


FIGURE 11. Regression plot for the proposed ANN. 


Performance Results 

Finally, the results yielded by the proposed ANN for the 481-point final testing dataset 
(which includes the ANN learning/development counterpart), in terms of performance 
variables defined in sub-section 3.4, are presented in this sub-section in the form of 
two graphs: (i) a regression plot (Fig. 11), where network target and output data are 
plotted, for each data point, as x- and y- coordinates, respectively; and (ii) a plot (Fig. 12) 
indicating (for all data) the (ii) maximum error, (ii 2 ) percentage of errors larger than 3%, 
and (ii 3 ) average error (called performance). 
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3.7% 



max error % errors > 3% performance 


FIGURE 12. Maximum and average (performance) errors for the proposed ANN. 

Positive w max (v = ]175,250[ m/s) 


ANN feature methods used in the best combo from each of the abovementioned nine 
parametric SAs are specified in Table 11 (numbers represent the method number as in 
Tables 2-4). Table 12 shows the corresponding relevant results for those combos and the 
208-point final testing dataset (which includes the ANN learning/development dataset), 
namely (i) maximum error, (ii) percentage of errors larger than 3%, (iii) performance (all 
described in sub-section 3.4, and evaluated for all learning data), (iv) total number of 
hidden nodes in the model, and (v) average computing time per example (including data 
pre- and post-processing). All results shown in Table 12 are based on target and output 
datasets computed in their original format, i.e. free of any transformations due to output 
normalization and/or dimensional analysis. Summing up the ANN feature combinations 
for all parametric SAs, a total of 219 combos were run for this problem. 

TABLE 11. ANN feature (F) methods used in the best combo from each parametric sub-analysis (SA). 


SA d 

F2 

F3 

F4 

F5 

F6 

F7 

F8 

F9 

F10 

F11 

F12 

F13 

F14 

F15 

1 1 

2 

6 

2 

5 

1 

1 

1 

1 

1 

3 

2 

3 

1 

3 

2 1 

2 

6 

2 

5 

7 

1 

2 

1 

1 

9 

2 

5 

1 

3 

3 1 

2 

6 

1 

5 

1 

1 

1 

1 

1 

3 

2 

3 

1 

3 

4 1 

2 

6 

2 

5 

1 

2 

1 

1 

1 

3 

2 

3 

1 

3 

5 1 

2 

6 

3 

5 

1 

3 

1 

1 

1 

3 

2 

3 

1 

3 

6 1 

2 

6 

1 

5 

7 

4 

1 

1 

1 

3 

2 

3 

1 

3 

7 1 

2 

6 

1 

5 

7 

5 

1 

1 

1 

3 

2 

3 

1 

3 

8 1 

2 

6 

1 

5 

7 

5 

1 

1 

1 

1 

2 

3 

1 

3 

9 1 

2 

6 

1 

5 

7 

5 

1 

3 

3 

1 

2 

3 

1 

3 
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The proposed model is the one, among the best ones from all parametric SAs, exhibiting 
the lowest maximum error (SA 9 - a Neural Network Composite (NNC)). Aiming to 
allow implementation of this model by any user, all variables/equations required for 
(i) data preprocessing, (ii) ANN simulation, and (iii) data postprocessing, are presented 
in the following sub-sections. The proposed model is an NNC made of 4 ANNs with 
architecture MLPN and a distribution of nodes/layer given by 2-3-3-3-1 for every network. 
Concerning connectivity, all networks are fully-connected, and the hidden and output 
transfer functions are all Logistic and Identity, respectively. All networks were trained 
using the LM algorithm. After design, the average NNC computing time concerning the 
presentation of a single example (including data pre/postprocessing) is 4.08E-05 s; Fig. 
13 depicts a simplified scheme of each NNC network. Finally, all relevant performance 
results concerning the proposed NNC are illustrated in sub-section 4.4.4. 

TABLE 12. Performance results for the best design from each parametric SA and for the final testing dataset 
(includes the ANN learning/development dataset): ANN and NNC. 



ANN 

SA 

Max Error 

(%) 

Performance 
All Data 

(%) 

Errors > 3% 

(%) 

Total Hidden 
Nodes 

Running Time/ 
Data Point 
(s) 

1 

20.3 

2.5 

31.3 

8 

1.16E-04 

2 

135.1 

10.3 

61.5 

50 

4.17E-05 

3 

20.3 

2.5 

31.3 

8 

4.16E-05 

4 

23.7 

2.6 

30.8 

8 

4.23E-05 

5 

19.4 

2.6 

29.8 

8 

4.16E-05 

6 

20.3 

2.9 

35.1 

8 

3.50E-05 

7 

18.3 

2.7 

32.7 

8 

3.66E-05 

8 

21.3 

2.7 

34.1 

8 

5.32E-05 

9 

8.8 

0.8 

5.8 

9 

3.55 E-05 




NNC 



SA 

Max Error 

(%) 

Performance 
All Data 

(%) 

Errors > 3% 

(%) 

Total Hidden 
Nodes 

Running Time/ 
Data Point 
(s) 

1 

2 

3 

4 

r 

- 

- 

- 

- 

- 

D 

6 

18.9 

0.9 

5.8 

8 

4.13 E-05 

7 

19.1 

1.2 

10.6 

8 

4.16E-05 

8 

18.4 

1.5 

17.3 

8 

5.70E-05 

9 

5.4 

0.4 

1.9 

9 

4.08E-05 
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i th MLPN (il = 14) 

(NNC computing time = 4.08E-05 s/example) 


FIGURE 13. Proposed NNC made of 4 fully-connected MLPNs - simplified scheme. 

Input Data Preprocessing 

For future use of the proposed NNC to simulate new data Y ]sjm (a 2 x P. m matrix) 
concerning P. m patterns,the same data preprocessing (if any) performed before training 
must be applied to the input dataset.That preprocessing is defined by the methods used 
for ANN features 2, 3 and 5 (respectively 2, 6 and 5 - see Table 2). Next, the necessary 
preprocessing to be applied to Y } sjm is fully described. 

Dimensional Analysis and Dimensionality Reduction 

Since no dimensional analysis (d.a.) nor dimensionality reduction (d.r.) were carried out, 
one has 



(19) 


Input Normalization 

After input normalization, the new input dataset is defined as function of the previously 
determined , and they have the same size, reading 


(Ur = Mr - |NP(: ’ 1 )) - /iNp ( : > 2 ) 


( 20 ) 


INP = 


5000 3171.90409844877 

214.666666666667 22.2385441413878 


where one recalls that operator 7'divides row / in the numerator by INP(/, 2). 

ANN-Based Analytical Model 

Once determined the preprocessed input dataset { Y ]sjm }f er (a 2 x P. m matrix), the 
next step is to present it to the proposed NNC to obtain the predicted output dataset 
{Yf. sjm }f er ( a 1 x ^sim vector )' which will be given in the same preprocessed format of the 
target dataset used in learning. In order to convert the predicted outputs to their "original 
format' (i.e., without any transformation due to normalization or dimensional analysis), 
some postprocessing might be needed, as described in 4.4.3. Next, the mathematical 
representation of the proposed NNC is given, so that any user can implement it to 
determine!/, . } after : 

5 ,sim J n 

(y \ a ft er _ (y , V/y \ after 

\ 1 5,simj n P,»J,(0) Z^{5,simj n(i) (21) 
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being 

^ 3(0 = % (^- 3(0 (Ur + ^- 3 (./ 2 (0 +*3( 0 ) (22) 

^ 4,0 = % (^- 4(0 {lUjr + ^- 4(,)^(0 + ^- 4 ( 0 ^ 3(0 +* 4 (o) 

KXZ = k ( W U» KA7+K 5( »r 2(0 + wl 5(i / m+ wl m r HI) +b m ) 

where / = 0,..., 3 and j 

?2=%=(pA=(Pi.s) = - -— 

1 + e 

(p 5 =(p 5 {s) = s (23) 

Arrays W js!j) and b s{[] are stored online in [20]. 

Output Data Postprocessing 

In order to transform the output dataset obtained by the proposed NNC, {Y 5sim } n after (a 1 
x P. m vector), to its original format (Y 5sin ), i.e. without the effects of dimensional analysis 
and/or output normalization (possibly) taken in target dataset preprocessing prior 
training, one has 

r,*. - {iU}T 00 

since no output normalization nor dimensional analysis were adopted in the proposed 
model. 


: R = 0.99995 



Target 

FIGURE 14. Regression plot for the proposed NNC. 


DOI: http://dx.doi.org/10.18272/aci.v11i2.1365 177 





Arti'culo/Artide 
Seccion/Section C 

11 (17), 154-181 


Potential of neural networks for maximum displacement predictions in railway beams on frictionally damped foundations 


aci 

avances 

encienciase 

ingenierias 


Performance Results 

Finally, the results yielded by the proposed NNC for the 208-point final testing dataset 
(which includes the ANN learning/development counterpart), in terms of performance 
variables defined in sub-section 3.4, are presented in this sub-section in the form of 
two graphs: (i) a regression plot (Fig. 14), where network target and output data are 
plotted, for each data point, as x- and y- coordinates, respectively; and (ii) a plot (Fig. 15) 
indicating (for all data) the (ii.,) maximum error, (ii) percentage of errors larger than 3%, 
and (ii 3 ) average error (called performance). 


5.4% 



max error NNC % errors >3% NNC performance NNC 


FIGURE 15. Maximum and average (performance) errors for the proposed NNC. 


CRITICAL VELOCITIES AND MAXIMUM DISPLACEMENTS PREDICTIONS 

Eleven pairs of curves were obtained as output of the ANN-based models described in 
sub-sections 4.1-4.4. Each pair presents the maximum negative (downward) and positive 
(upward) displacement predictions as function of load velocity (from 50 to 300 m/s in 
intervals of 5 m/s) for different values ofthe maximal distributed friction force f , as depicted 
in Fig. 16 (two plots are presented for the sake of legibility). Note that the classic Winkler 
foundation case corresponds to the frictionless case (f = 0). Comparing the homologous 
curves in Fig.19 and Fig. 5(a) in [4], a very good visual agreement between the ANN-based 
predictions and the results ofthe mechanical model is observed. For a precise comparison, 
the obtained target (FE-based) and output (ANN-based) values can be found in [13]. 

The set of curves shows that the increase ofthe maximum frictional force per unit length 
(f) leads, as expected, to the reduction ofthe displacement peaks. The existence of a 
critical velocity, that is, a velocity that induces the beam's highest displacements, is also 
clear in Fig. 16. It is observed that, for small values off, the value ofthe critical velocity is 
just slightly affected, whereas for larger frictional forces that value clearly rises. 
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FIGURE 16. Maximum upward and downward ANN-based displacements for a fictionally damped beam as 
function of the load velocity: (a) f = 0,2,4,6,8 and 10 kN/m, (b) f = 1,3,5,7 and 9 kN/m. 


DISCUSSION 

In future publications it will be guaranteed that the validation and testing data subsets 
will be composed only by points where at least one variable (does not have to be the 
same for all) takes a value not taken in the training subset by that same variable. Based 
on very recent empirical conclusions by Abambres, the author believes it will lead to 
more robust ANN-based analytical models concerning their generalization ability (i.e. 
prediction accuracy for any data point within the variable ranges of the design data). 


FINAL REMARKS 

This paper demonstrated the potential of Artificial Neural Networks (ANN) to effectively 
predict the maximum displacements and the critical velocities in railway beams under 
moving loads. Four ANN-based models were proposed, one per load velocity range ([50, 
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175] U [250,300] m/s;]175,250[ m/s) and per displacement type (upward or downward). 
Each model is function of two independent variables, a frictional parameter and the 
encfenoafe load velocity (v).The maximum and mean errors found for each aforementioned model 
ingemerias ^ ^ data), when comparing the ANN- and FE-based solutions, were: (i) {2.7, 0.4}% for 
downward displacements when v = [50,175] U [250,300] m/s, (ii) {5.2,1.1}% for downward 
displacements when v = ] 175,250[ m/s, (iii) {3.7, 0.2}% for upward displacements when 
v = [50, 175] U [250, 300] m/s, and (iv) {5.4, 0.4}% for upward displacements when v 
- ] 175, 250[ m/s. Furthermore, whereas the FE-based solution involves an average 
computing time per data point of thousands of seconds, the ANN counterpart does not 
even need a millisecond. More versatile ANN-based analytical models for the same type 
of problem may follow from this study by including more independent variables, such 
as the foundation stiffness modulus, the applied load magnitude, and the geometrical/ 
mechanical properties of the railway beam. 
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